Abstract

Recently we have witnessed a dramatic increase in both the study and
use of data bases. These activities have in turn stimulated interest in
data description facilities. The work reported here was motivated by the
observation that descriptors for structured data are themselves, in fact,
"data-like" in many respects. This paper introduces a simple language
for relationally organized data and a companion descriptor language, for
the purpose of demonstrating a single system of representation for both
data constructs and their descriptors. It is assumed that each data
construct (i.e., relation) belongs to a previously defined relation class
and that a descriptor is used to define the structure of the elements of
each class. The system presented here is "closed" in the sense that it
allows each descriptor to be represented as a relation. Its adoption
would permit data description facilities to be implemented in terms of
data manipulation facilities.


Key Words and Phrases: Data description, relation, descriptor, relation
class, closed descriptive system.
Introduction


Non-procedural data description languages (DDLs) have emerged in
conjunction with more flexible schemes for organizing data. Some DDLs
(e.g., [1,2]) were tied to a particular procedural host language (e.g.,
COBOL). Others (e.g., [3,4]) supported only specific classes of data
structures. The most general DDLs (e.g., [5,6]) were host language
independent and/or embraced a wide spectrum of data organizations (e.g.,
hierarchical, relational, network). In each case, however, the facilities
for constructing data descriptors were kept separate from those for
handling the data itself.

The work reported here is an adaptation of methods applied earlier
to hierarchical data structures [7]. It was motivated by the observation
that descriptors for structured data are themselves, in fact, "data-like"
in many respects. We introduce a simple data language and a companion
descriptor language, for the purpose of demonstrating a single closed
system of representation for both data constructs (i.e., relations) and
their descriptors. It is assumed that each relation belongs to a previously
defined relation class and that a descriptor is used to define the structure
of the elements of each class.

The system is "closed" in the sense that it allows each descriptor
to be represented in terms of a relation whose structure is, in turn,
specified by means of a second-level descriptor. Closure is achieved
because a single relation class is found whose own structure is sufficiently
general to accommodate the data representation of any descriptor,
including its own. Transformations are defined, relating the
descriptor and data representations for a descriptor.

An obvious benefit could be derived from a scheme such as the
one proposed here. Its implementation would permit a single set of
data manipulation facilities (e.g., [4,8]) to suffice both for
building elements of existing relation classes and for defining
new relation classes themselves.

A Relational  Data-Descriptor  Language  Pair 

We require a framework within which to discuss the relationship
between data and the descriptors which serve to define its structure.
Let us therefore introduce a language of relational data structures (i.e.,
relations) and a complementary descriptor language. It is important to
remark at this point that this paper is deliberately incomplete in its
treatment of relational data bases; we are focusing here upon certain
aspects of the description of relationally organized data, whereas we ignore
its manipulation. For a more thorough treatment of the theory and potential
of the relational approach to data management, the reader is referred to

We consider a data language consisting of relations defined over
(not necessarily distinct) sets (called domains) $D_1, D_2, \ldots$. To be
consistent with [9], we shall require that each $D_i$ contain only simple,
non-aggregate values; integers and identifiers would be examples of such
domains. A relation $R$ over $D_1, D_2, \ldots, D_{n_R}$ is a subset of the
cartesian product $D_1 \times D_2 \times \cdots \times D_{n_R}$. The number
of domains involved, $n_R$, is referred to as the degree of $R$. Stated
another way, $R$ is a set of tuples, each of the form
$\langle d_1, d_2, \ldots, d_{n_R} \rangle$, where $d_k \in D_k$.
For convenience (and to distinguish between nondistinct domains), a
unique role name, $id_k$, is associated with each domain, $D_k$, which
underlies $R$. A role name is used for accessing the corresponding component
of any tuple $r \in R$ (i.e., $id_k$ selects the $k$-th component of $r$). We
shall denote the current set of relations by $\mathcal{R}$.
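
The definitions above can be made concrete with a minimal sketch. It assumes that domains are modeled as Python types and that a relation is a Python set of tuples; the relation COMPONENT, its role names, and its values are hypothetical and are not taken from the paper.

```python
# Assumption: each domain is modeled as a Python type holding simple,
# non-aggregate values. All names and values here are illustrative only.
ROLES = ("PART", "SUBPART", "QTY")     # role names id_1, id_2, id_3
DOMAINS = (str, str, int)              # D_1 and D_2 are the same domain

# A relation is a set of tuples, so duplicates and row order do not matter.
COMPONENT = {
    ("engine", "piston", 4),
    ("engine", "crankshaft", 1),
    ("piston", "ring", 3),
}

def degree(domains):
    """The degree of a relation is the number of underlying domains."""
    return len(domains)

def select(role, r):
    """Use a role name to access the corresponding component of tuple r."""
    return r[ROLES.index(role)]

def in_product(r):
    """Check that r lies in the cartesian product D_1 x D_2 x ... x D_n."""
    return len(r) == len(DOMAINS) and all(
        isinstance(d, dom) for d, dom in zip(r, DOMAINS)
    )
```

Because the first two domains coincide, only the role names PART and SUBPART tell the two components apart, which is the purpose the text assigns to role names.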

We shall define the structure of each relation by means of a descriptor.
A separate language is used for formulating these descriptors. This
language must allow us to specify the domains and role names from which a
relation is built. To define a relation $R \in \mathcal{R}$ we use a
descriptor $ds(R)$ which takes the form of a tuple of named domains

    $ds(R) = \langle id_1 : D_1,\ id_2 : D_2,\ \ldots,\ id_{n_R} : D_{n_R} \rangle$    (1)

where each $D_k$ is a domain and the corresponding $id_k$ is its role name
relative to $R$. We let $\mathcal{D}$ denote the current set of descriptors.
It is important to distinguish between $\mathcal{D}$ and the set of all
possible descriptors; while the latter depends only upon the descriptor
language, the former depends upon $\mathcal{R}$ as well. We can express the
assumed relationship between data (relations) and descriptors in terms of
a mapping

    DESCRIBED_BY : $\mathcal{R} \to \mathcal{D}$    (2)


where, as we have observed above, $\mathcal{R}$ and $\mathcal{D}$ vary with
time. Each descriptor $D$ actually defines the structure of a set of
relations, called a relation class. Formally speaking, the relation class
is the inverse image of $D$ under DESCRIBED_BY, i.e.,

    DESCRIBED_BY$^{-1}$ : $\mathcal{D} \to$ {relation classes}.


Fig. 1 displays an example relation, EMP, and the corresponding
descriptor ds(EMP); EMP might be used to hold employee information. EMP
is defined over four nondistinct domains: two instances each of a domain
name and a domain integer. For convenience EMP is displayed using a
tabular format with one column, labelled by a role name, for each domain
(instance). It should be noted that the order of appearance of the rows
(i.e., tuples) in the table is irrelevant.
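
The mapping DESCRIBED_BY and the relation classes it induces can be sketched as follows. This is only an illustration under stated assumptions: descriptors are modeled as tuples of (role name, domain name) pairs, relations are identified by name, and the relations EMP_1975 and DEPT are hypothetical additions used to populate the example.

```python
from collections import defaultdict

# A descriptor is a tuple of (role name, domain name) pairs.
DS_EMP = (("NAME", "name"), ("SAL", "integer"),
          ("MGR", "name"), ("CODE", "integer"))
DS_DEPT = (("DNAME", "name"), ("BUDGET", "integer"))   # hypothetical

# DESCRIBED_BY maps each current relation (by name) to its descriptor.
# EMP_1975 and DEPT are invented for illustration.
DESCRIBED_BY = {"EMP": DS_EMP, "EMP_1975": DS_EMP, "DEPT": DS_DEPT}

def relation_classes(described_by):
    """Group relations by descriptor: the inverse image of each descriptor."""
    classes = defaultdict(set)
    for relation_name, descriptor in described_by.items():
        classes[descriptor].add(relation_name)
    return dict(classes)

CLASSES = relation_classes(DESCRIBED_BY)
```

Here CLASSES[DS_EMP] is the relation class of ds(EMP). Both mappings change whenever a relation or descriptor is added, which reflects the remark that the sets of relations and descriptors vary with time.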


A Data  Representation  for  Descriptors 


Within the framework established in the previous section, any extension
of the system to define a new data relation class requires that a new
descriptor $D$ be created and included in $\mathcal{D}$. We discuss here a
transformation technique whereby any descriptor can be represented as a
relation. The importance of this technique is that it facilitates such
extensions by reducing the problem of creating a new descriptor to one of
building an element of a previously defined relation class. We let rn(D)
denote the relation by which we represent $D$.


For a relation to be used to represent descriptor information, its own
structure must be specified by a descriptor. This would seem to lead,
unfortunately, to a system requiring an infinite number of descriptor levels
(i.e., a relation R, described by a descriptor ds(R), represented as a
relation rn(ds(R)), described by a descriptor ds(rn(ds(R))), ...). However,
this situation can be avoided if a single relation class can be found to
accommodate the data representation of any descriptor, including the one
that defines the structure of (each element of) the relation class itself.

We recall from eq. (1) that a descriptor $D$ is an n-tuple of
identifier-domain pairs. We can represent each element of $D$ by means of a
tuple of the form

    $\langle k,\ id_k,\ domain_k \rangle$    (4)

where $k$ indexes a position within $D$. Having made this observation, let us
now define a relation, denoted by rn(D), whose tuples are given by eq. (4)
for $k = 1, 2, \ldots, n$. For example, the result of applying this technique
to the descriptor ds(EMP) in Fig. 1 is


rn(ds(EMP)) =

    INDEX   ROLE   DOM
    1       NAME   name
    2       SAL    integer
    3       MGR    name
    4       CODE   integer        (5)
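
The representation technique can be sketched in a few lines. This is an illustration rather than the paper's implementation: it assumes a descriptor is held as a tuple of (role name, domain name) pairs and that the resulting relation is a Python set of (INDEX, ROLE, DOM) tuples.

```python
def rn(descriptor):
    """Represent a descriptor as a relation of <k, id_k, domain_k> tuples."""
    return {
        (k, role, domain)
        for k, (role, domain) in enumerate(descriptor, start=1)
    }

# ds(EMP), written as (role name, domain name) pairs.
DS_EMP = (("NAME", "name"), ("SAL", "integer"),
          ("MGR", "name"), ("CODE", "integer"))

RN_DS_EMP = rn(DS_EMP)
```

The INDEX component is what preserves the ordering of the descriptor, since the tuples of a relation are themselves unordered.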

We note that for any descriptor $D$, rn(D) has degree 3 and is defined
over domains integer, identifier, and domain; as in eq. (5), we shall use
INDEX, ROLE, and DOM as their respective role names. The descriptor for
rn(D), namely ds(rn(D)), is given by

    $\langle$INDEX : integer, ROLE : identifier, DOM : domain$\rangle$    (6)

Note that both the INDEX and ROLE domains serve as candidate keys [4] for
such a relation, since positions and role names are each unique within a
descriptor.

This use of a domain-valued domain (i.e., a domain whose elements are
domain names) is very much akin to the use of mode-valued modes in
programming languages such as ALGOL 68 [10].

The important thing to notice about eq. (6) is that it does not depend
upon $D$. Regardless of the original descriptor $D$ to which this
representation scheme is applied, it always yields a relation in the
relation class defined by the descriptor in eq. (6). For convenience,
let us denote this "descriptor-descriptor" by $D_{super}$. To demonstrate
that we have indeed achieved the desired closure with respect to descriptor
levels, we apply the transformation rn(·) to $D_{super}$, yielding


rn($D_{super}$) =

    INDEX   ROLE    DOM
    1       INDEX   integer
    2       ROLE    identifier
    3       DOM     domain


and we observe that $D_{super}$ is a fixed point [11] of the composite
transformation ds(rn(·)), i.e.,

    $ds(rn(D_{super})) = D_{super}$.
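
The closure property can be checked mechanically with a small sketch. The membership tests for each domain, the set of known domain names, and the helper conforms are assumptions introduced for illustration; the paper does not specify how domain membership is decided.

```python
# Assumed set of domain names known to the system (illustrative).
DOMAIN_NAMES = {"integer", "identifier", "domain", "name"}

# Assumed membership test for each named domain.
MEMBERSHIP = {
    "integer": lambda v: isinstance(v, int),
    "identifier": lambda v: isinstance(v, str) and v.isidentifier(),
    "name": lambda v: isinstance(v, str),
    "domain": lambda v: v in DOMAIN_NAMES,   # a domain-valued domain
}

def rn(descriptor):
    """Represent a descriptor as a relation of <k, id_k, domain_k> tuples."""
    return {(k, role, dom) for k, (role, dom) in enumerate(descriptor, start=1)}

def conforms(relation, descriptor):
    """True if every tuple has the degree and domains the descriptor requires."""
    tests = [MEMBERSHIP[dom] for _, dom in descriptor]
    return all(
        len(t) == len(tests) and all(test(v) for test, v in zip(tests, t))
        for t in relation
    )

D_SUPER = (("INDEX", "integer"), ("ROLE", "identifier"), ("DOM", "domain"))
DS_EMP = (("NAME", "name"), ("SAL", "integer"),
          ("MGR", "name"), ("CODE", "integer"))

# Every descriptor's data representation, including that of D_SUPER itself,
# belongs to the single relation class defined by D_SUPER.
CLOSED = conforms(rn(DS_EMP), D_SUPER) and conforms(rn(D_SUPER), D_SUPER)
```

Under these assumptions no third level of description is needed: the relation representing the descriptor-descriptor is already an element of the class it defines.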


Clearly  it  is  a straightforward  matter  to  define  an  inverse  transformation 
to  rn  for  recovering  the  descriptor  representation  of  a descriptor  from  its 
data  representation.  Fig.  2 diagrams  the  relationships  that  exist  between 
the  various  constructs  arising  from  a typical  data  relation  (e.g.,  EMP). 
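
One possible form of the inverse transformation is sketched below. The name rn_inverse is hypothetical, and the sketch assumes the same tuple-based encodings of descriptors and relations used for illustration elsewhere.

```python
def rn(descriptor):
    """Represent a descriptor as a relation of <k, id_k, domain_k> tuples."""
    return {(k, role, dom) for k, (role, dom) in enumerate(descriptor, start=1)}

def rn_inverse(relation):
    """Recover a descriptor from its data representation.

    Sorting on INDEX restores the positional order that the unordered
    relation does not carry by itself; INDEX values are unique, so the
    sort never needs to compare the remaining components.
    """
    return tuple((role, dom) for _, role, dom in sorted(relation))

DS_EMP = (("NAME", "name"), ("SAL", "integer"),
          ("MGR", "name"), ("CODE", "integer"))
ROUND_TRIP = rn_inverse(rn(DS_EMP))
```

The round trip returns the original descriptor, so the two representations carry the same information.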


Conclusions 

We have examined a transformation technique by which descriptors can be
represented as data relations. Although the technique is demonstrated only
for a particular choice of data and descriptor languages, it is clear that
it can be generalized to a broad class of data/descriptor language pairs.
It is in fact applicable to any pair in which the data language constructs
are sufficiently powerful to encode the variability of structure exhibited
by the set of descriptors.

The result of employing such a technique is a single closed system
of representation for both data and the descriptors which define the structure
of that data. The benefit derived from adopting this kind of system is that
it would allow a data description facility to be completely subsumed by
(or defined in terms of) a data manipulation facility.